This report explains why the Management Implications
of Team Teaching (MITT) project chose multiple linear regression and
path analysis to analyze through-time relationships among variables,
and why it rejected repeated-measures analysis of variance (ANOVA)
and difference scores over time. Project MITT examined governance and
work structures for five time periods from 1974 to 1976 in 29
elementary schools, 15 of which had introduced team-teaching (or
unitized) methods in 1974. To analyze longitudinal changes among
variables and schools, the project's statistical techniques had to
take account of small sample size and multiple time periods; they
also had to control for pre-1974 differences among the schools,
changes in variables because of unitization, and differences in
variable means and ranges. All of these factors interfered with
comparisons of unitized and nonunitized schools and distorted
relationships among the variables. Hierarchical multiple linear
regression solved these problems by relating variables to one another
both over time and in order of explanatory power. Path analysis using
lagged multiple linear regression helped to test postulated
relationships through time and explore for further relationships.
Four appendices discuss ANOVA, difference scores, path analysis, and
corrections used for data cyclicity. (Author/RH)
Introduction

This report originated in the search for an appropriate strategy
for analyzing through-time relationships among selected variables in the
MITT (Management Implications of Team Teaching) study. It explains
the rationale for our use of multiple linear regression and, at times,
path analysis as the means of sorting out through-time relationships and
discusses some of the less fruitful approaches we had considered at
first.

MITT had collected data concerning the governance and work structure
in 29 elementary schools, 16 of which implemented a multiunit form of
organization among the teaching staff in the fall of 1974. To strengthen
our confidence in inferences about the effects of adopting the multiunit
organization, we collected data in the spring of 1974, when units had not
yet been formed, and every six months thereafter for two years.
Thirteen of the schools adopted no such innovative structural change over
the length of the study and served as controls matched by district
to the experimentals whenever possible (Packard et al., 1976).

Because of the difficulties in getting a through-time individual-level
file together, the majority of MITT's early analyses used only
school-level indices. Although some variables existed only at the school
level, e.g. extent of Collegial Decision Making, others had to be
aggregated as means, e.g. extent of Classroom-related Communication.
This immediately put a constraint on the effective sample size for any
analytical strategy we planned to use.



More substantive reasons existed for employing a school level 
analysis. We had conceptualized some major variables of interest as 
properties of the organization and expected they would change over time in 
response to the anticipated school-wide installation of a multiunit 
structural organization among the faculty. Although the changes we were
investigating relied upon activities of individual teachers, the
implementation was to be school-wide, reflecting the behavior of most, if not all,
teachers in the school. 

Furthermore, the schools were units of analysis which remained
throughout the course of the study, even though the teacher turnover gave
us a slightly different staff composition at each wave. In fact, only
about two-thirds of the faculty in all unitized and conventional schools
at T1 were present in the schools at T5. By using the school as the unit
of analysis we did not have to confine our information about variables
to that given by this two-thirds faculty cohort.

This does not imply we eliminated the individual teacher as a unit
of analysis; many of our cross-sectional and longitudinal analyses used
the teacher as the unit (Packard et al., 1976; Packard et al., 1978).
This was especially true for the teacher attribute and perceptual
variables which conceptually characterize properties of individuals rather
than organizations. For the school-level analyses, we aggregated many of
these to depict mean levels of selected teacher attributes in each school
(Packard et al., 1976; 1978).




Certainly, the linear regression strategy which we adopted for
longitudinal data analysis applied equally to the individual level, but
because of the larger sample for analysis, problems in restricting the
number of independent variables did not apply. At the time we were
selecting a longitudinal strategy, however, the individual through-time
file did not even exist and hence was not a major focus of our concern.
Furthermore, some of the problems and alternative strategies we encountered
transcend the unit of analysis as a consideration.

Selecting the Method of Longitudinal Analysis 



Our preliminary queries about across-time analyses initially centered
upon the detection of experimental-control differences at the various
points in time while taking into account temporal differences in variation
in the dependent variable of interest. Our naivete led us to attempt to
fit the Repeated-Measures ANOVA to our design but that model was
eventually deemed wholly inappropriate (see Appendix A).

Two other strategies struck us as viable options for analyzing
differences between experimentals and controls and for relating change
in one or more variables to change in another. One was the use of
difference scores, created by subtracting scores at one point from those
at an earlier point, which would be used in some type of correlational
analysis or group comparison. This approach also proved unacceptable
(see Appendix B).



We settled upon the generalized multiple linear regression. The
approach relates the status of a dependent variable at one point in time
to the status of the same and/or other variables at previous points in time.
Its use in analysis of covariance is quite amenable to the study of gain
or loss (change) as a function of treatment (Pelz and Lew, 1970). In
addition it permits the assessment of curvilinear and contingent/interactive
relationships (Cohen and Cohen, 1975; Amick and Walberg, 1975).

The analytical regression strategy we selected is called hierarchical
regression analysis. A dependent variable is regressed on several
independent variables in a particular order. For example, if Y were
regressed on X1, X2, and X3, each in that order, a hierarchical analysis
will provide essentially three types of information.

One is the total amount of variance in Y that is accounted for
by the three variables together, R². Another is the increment in the
proportion of variance explained due to the addition of a variable. This
means one essentially has three regression equations: Y with X1, Y with
X1 and X2, and Y with X1, X2, and X3. The increments are the proportion
of variance explained by X1 alone, that added by X2 after X1 is already
entered, and that added by X3 after X1 and X2 are already in the equation.

A final important source of information is the regression
coefficients, indices of the unique "effect" of each independent variable upon
the dependent variable. The term "unique" indicates a coefficient that
reflects the directional relationship of an X on Y controlling for the
amount of variance in Y that the X's share with each other. The fact
that "sharing" occurs is partly reflected in the fact that the independent
variables are usually correlated. The coefficients are called b-weights
when variables are expressed in raw score form and beta weights when all
variables have been converted to standardized scores with means equal to
zero and standard deviations equal to one.
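
The three kinds of output described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are simulated, the helper names (`r_squared`, `hierarchical_increments`, `weights`) are hypothetical, and nothing here reproduces the software MITT actually used.

```python
import numpy as np

def r_squared(y, predictors):
    """R^2 from a least-squares fit of y on the given columns plus an intercept."""
    design = np.column_stack([np.ones(len(y))] + list(predictors))
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    return 1.0 - (residuals @ residuals) / np.sum((y - y.mean()) ** 2)

def hierarchical_increments(y, ordered_predictors):
    """Cumulative R^2 after each entry step, and the increment added at each step."""
    cumulative = [r_squared(y, ordered_predictors[: k + 1])
                  for k in range(len(ordered_predictors))]
    increments = np.diff([0.0] + cumulative)
    return np.array(cumulative), increments

def weights(y, predictors):
    """Raw-score b-weights and standardized beta weights for the full equation."""
    X = np.column_stack(predictors)
    design = np.column_stack([np.ones(len(y)), X])
    b = np.linalg.lstsq(design, y, rcond=None)[0][1:]   # drop the intercept
    beta = b * X.std(axis=0, ddof=1) / y.std(ddof=1)    # rescale b to standard-score units
    return b, beta

# Simulated data (an assumption for illustration): 29 cases, correlated X1 and X2.
rng = np.random.default_rng(0)
n = 29
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
x3 = rng.normal(size=n)
y = 0.5 * x1 + 0.3 * x2 + 0.2 * x3 + rng.normal(size=n)

cumulative, increments = hierarchical_increments(y, [x1, x2, x3])
b, beta = weights(y, [x1, x2, x3])
```

Because X1 and X2 are correlated in this simulation, the increment credited to X2 depends on its position in the entry order, while the b-weights and beta weights of the full equation do not.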

Path analysis, a method for testing hypothesized causal relationships
with the use of the multiple regression, held some promise for
aiding our assessment of longitudinal relationships. In some instances
we employed this approach to test a set of carefully postulated
relationships; in others we searched for relationships in a more exploratory
fashion (see Appendix C).

Lagged Multiple Linear Regression Analysis

Heise (1970) presented a model for using path analysis to assess 
through-time causal relationships when one has two waves of data. 
Pelz and Lew (1970) extended his model to cover multiple waves of data. 

Heise dealt with a two-wave two-variable system as diagrammed below;
the subscripts "1" and "2" indicate earlier and later points in time
respectively.

[Diagram not reproduced: two-wave, two-variable path model linking X1 and Y1 to X2 and Y2.]
His approach is actually an extension of path analysis to longitudinal
data by means of lagged regressions. The regression estimates are obtained
through a set of multiple regression analyses: X2 is regressed on X1 and
Y1 in that order; Y2 is regressed on Y1 and X1 in that order. The standardized
regression coefficients or betas resulting from the analyses are estimates
of the path coefficients and are represented as p's; as with the typical
cross-sectional path analysis, his strategy uses beta weights as the
meaningful coefficients.

Path $p_{X_2 Y_1}$ represents the impact of variation in Y at Time 1 on
variation in X at Time 2; $p_{Y_2 X_1}$ represents the impact of variation in X
at Time 1 on variation in Y at Time 2. One can compare these empirical
betas or path coefficients and infer direction and magnitude of influence,
which may involve only X1 to Y2, only Y1 to X2, both, or neither.

The paths $p_{X_2 X_1}$ and $p_{Y_2 Y_1}$ represent the temporal stability in X and
Y respectively. Large positive stability coefficients suggest not much
happened during the interval to disturb the original distributions; that is,
variations at T1 for the most part determine variations at T2. Low
stability coefficients would suggest that the distributions for each
variable changed considerably between measurements.
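
As a rough illustration of the two lagged regressions in the two-wave model, the sketch below estimates the four path coefficients as standardized betas. The data are simulated under the assumption that X1 influences Y2 while Y1 has no influence on X2; the sample size and effect sizes are arbitrary choices, and `standardized_betas` is a hypothetical helper name.

```python
import numpy as np

def standardized_betas(y, predictors):
    """Beta weights from regressing standardized y on standardized predictors."""
    z = lambda v: (v - v.mean()) / v.std(ddof=1)
    Z = np.column_stack([z(p) for p in predictors])
    # No intercept is needed because every standardized variable has mean zero.
    coef, *_ = np.linalg.lstsq(Z, z(y), rcond=None)
    return coef

# Simulated two-wave data (assumption): a large n is used so estimates are stable.
rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)
y1 = 0.3 * x1 + rng.normal(size=n)
x2 = 0.7 * x1 + 0.7 * rng.normal(size=n)             # X2 depends on X1 only
y2 = 0.5 * y1 + 0.4 * x1 + 0.7 * rng.normal(size=n)  # Y2 depends on Y1 and X1

# X2 regressed on X1 and Y1: stability of X and the cross-lagged path from Y1.
p_x2x1, p_x2y1 = standardized_betas(x2, [x1, y1])
# Y2 regressed on Y1 and X1: stability of Y and the cross-lagged path from X1.
p_y2y1, p_y2x1 = standardized_betas(y2, [y1, x1])
```

In this simulation the cross-lagged path from X1 to Y2 should come out clearly positive and the path from Y1 to X2 close to zero, which is the pattern one would read as influence running only from X to Y.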

One of the assumptions of the model is that the measurement lag between
Tn and Tn+1 matches closely the actual causal relationship lag. Pelz and
Lew (1970) expanded on this assumption with a Monte Carlo study of the model
employing several waves of data and lags of different lengths.
They defined a population in which X1 caused Y2 but Y1 did not
cause X2 and specified a causal interval of four units; they also
specified population betas for the relationships of X1 to X2 and X1 to
Y2. This enabled them to investigate the sizes of betas one would
obtain if he were studying a causal lag that were either shorter or
longer than the actual one. In addition, they also obtained betas for
the relationship of Y1 on X2 to see what evidence the empirically
discrepant lags might produce about the existence of this relationship
when in fact it did not exist.



Equilibrium

As a guide for what to expect given certain levels of long-term and
short-term stabilities, inaccurate measurement lag, and true population
causal relationships, the Pelz and Lew study initially showed promise of
offering us some utility. However, because it addressed a system of
relationships in equilibrium, uninterrupted by contrived/planned change
or trauma (as opposed to emergent change),* we grew increasingly doubtful
of the interface with the processes we were examining.



*This last observation we infer after reading both Heise's and
Pelz and Lew's discussion of the model.



Under a system of equilibrium, the values of variables in schools
undisturbed by a major restructuring would fluctuate through time around
some level. In a system altered by some innovation, values of variables
may increase or decrease over time eventually to settle into another
condition of equilibrium around some new higher or lower (or perhaps
the original) level.

Contingent rising and falling of variables as a function of each
other would also describe a state of equilibrium in the schools. If
relationships exist across waves between a pair of variables, one would
expect to find increases in one variable followed later by increases in
another (were the relationship positive) or by decreases (inverse
relationship) in another. In unitized schools we expected the introduction
of the innovation to disrupt the equilibrium, upsetting normal fluctuations
in, and normal contingencies among, the variables; after a time, the variation
and relationships would settle back into another state of equilibrium.

The change introduced by the innovation studied by MITT did not occur
at one point in time; because the units continued to exist beyond the
point of their formal establishment in the school, any new equilibrium
level would evolve in their presence. Regardless of the level around
which values fluctuated, the new equilibrium in the unitized schools would
represent the status under a qualitatively different situation than in the
nonunitized schools.

Although the Pelz and Lew formulation helped us determine the
strategy and clarify some assumptions of our longitudinal analyses, the
equilibrium aspect left us in doubt as to their model's applicability as a
guide for our analyses. Conceivably, only our control schools could be
considered in a general state of equilibrium, particularly during the
first year. A risk was that the basic Heise model which appeared
suitable to a system in equilibrium was unsuitable for a system disrupted,
in part, by planned school-wide change.

Coleman (1968) developed a mathematical treatment for analyzing
change which drew heavily on the use of multiple linear regression output
in equations from calculus. A basic notion in the formulation of his
approach was the idea of systems in equilibrium. We attempted some
preliminary analyses using one of the models he discussed and found some
of those results consonant with expectations he laid out. Nevertheless
we had reservations about the applicability of his formulation, the
particular model of his we selected to use, and the proper interpretation
of the results; moreover, the complicated presentation made us doubt our
own understanding of many of his formulations.



Stability

Pelz and Lew also discussed considerations of short- and long-term
stability in a variable, which are reflected in the autocorrelation between
different waves: adjacent-wave correlations reflect short-term
stability, longer discrepancies reflect long-term stability. If the
autocorrelation drops to zero as the time lag increases, then long-term
stability is low or does not exist; if it drops to some constant value, then
it exists to some degree depending upon the size of the correlation. They
interpret long-term stability in terms of persistent characteristics of
individuals, such as personality and I.Q. The analogy to schools might be
something like school climate or control structure or more pervasive
immutable characteristics such as district wealth, school size, staff
characteristics.



Some variables which characterize the school may change in a
cyclical fashion. For example, in our design, we would expect to find
greater teacher turnover between than within years. Similarly, we would
expect many of the decisions about year-round routines to be made in the
fall. In combination with mean trends, autocorrelations could be used
to assess the cyclical nature of school characteristics.

A variety of patterns could appear. The successive rise and fall of
the mean level of a variable accompanied by a high wave-to-wave
autocorrelation would signal the presence of a cycle typifying most of the
schools. Crests in the fall of the year would be followed by troughs in
the spring; troughs in the fall would be followed by crests in the spring.
High correlations between seasons but not between adjacent waves would
also suggest a cyclical pattern.

Weak autocorrelations signal that the differences
between means through time do not necessarily reflect what actually
goes on in each school. For example, if the overall means for a variable
stay the same between two waves but the autocorrelation is weak, then
we can infer that the scores for all schools do not tend to remain the same;
if they did we would expect a high correlation.
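
The quantities discussed in this section (wave means, adjacent-wave autocorrelations, and same-season autocorrelations) can be laid out side by side as below. This is a sketch only: the scores are simulated, with each wave drawn independently around the same mean so that flat means coexist with weak autocorrelations, and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_schools, n_waves = 29, 5   # five waves, T1 (spring 1974) through T5 (spring 1976)

# Simulated school-level scores (assumption): every wave is drawn independently
# around the same mean, so wave means stay flat while schools reshuffle.
scores = rng.normal(loc=50.0, scale=10.0, size=(n_schools, n_waves))

wave_means = scores.mean(axis=0)               # one mean per wave
autocorr = np.corrcoef(scores, rowvar=False)   # 5 x 5 wave-by-wave correlations

adjacent = np.diag(autocorr, k=1)   # T1-T2, T2-T3, T3-T4, T4-T5: short-term stability
seasonal = np.diag(autocorr, k=2)   # T1-T3, T2-T4, T3-T5: same season, one year apart
```

With real data, high `seasonal` values alongside low `adjacent` values would point toward the cyclical pattern described above, while flat `wave_means` with weak autocorrelations would indicate that individual schools are shifting even though the overall mean is not.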

Use of Lagged Multiple Linear Regression on MITT Data

MITT used regression analyses to address two general types of goals:
one was the detection of T5 differences between unitized and nonunitized
schools; the other was the assessment of lagged relationships among
variables. The MITT Project had data that could be analyzed at different
levels; those of primary interest were the individual, the unit (in which
case the analysis naturally was confined to the unitized schools), and
the school. At the school level, the sample size, 29 at best, constrained
the number of independent variables we could use. We usually limited
the number to two and, in the case of regressions using the autoregression
term, this meant we generally had one independent variable of central
interest.



Controlling for Pre-Unitization Differences

The availability of T1 data provided additional information about
variance in the dependent variable at some later time, Tn, and afforded
us the opportunity for a more powerful analysis. Because the two groups
of schools differed at T1 on several variables, we could not be sure that
any differences detected at T5 could be attributed to effects of
unitization. The statistical determination of unitized-nonunitized differences
had to take the pre-unitization differences between the two sets of schools
into account. To do this we employed the hierarchical multiple regression
approach by regressing T5 values of a dependent variable first on the
T1 values of the same variable and then on a second variable, a
dummy-coded vector with 1's for unitized and 0's for nonunitized schools. This
was our unit organization variable or, as we called it, EXPCON.
Since the procedure is the regression application of the analysis
of covariance, our interest was in the increment in the proportion of
variance accounted for by EXPCON beyond that explained by the T1-T5
autocorrelation (Kerlinger and Pedhazur, 1973). In the same regression
equation we could check for an interaction effect to determine if the
influence of unit organization on T5 values were contingent upon T1
values. The absence of an interaction is necessary for use of the
analysis of covariance; had we found significant interaction we would have
done further analyses to assess its nature since the presence of any main
effects would have been uninterpretable.
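
The covariance-by-regression procedure just described can be sketched as three nested equations: T5 on T1, then adding EXPCON, then adding the T1 x EXPCON product term. The example below uses simulated scores for 16 unitized and 13 nonunitized schools; the effect sizes and the helper names `r_squared` and `f_for_increment` are assumptions made for illustration, and the F ratio shown is the standard test for an increment in R².

```python
import numpy as np

def r_squared(y, predictors):
    """R^2 from a least-squares fit of y on the given columns plus an intercept."""
    design = np.column_stack([np.ones(len(y))] + list(predictors))
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def f_for_increment(r2_reduced, r2_full, n, k_full, k_added):
    """F ratio for the R^2 gained by adding k_added predictors (k_full in the full model)."""
    return ((r2_full - r2_reduced) / k_added) / ((1.0 - r2_full) / (n - k_full - 1))

# Simulated school-level data (assumption): EXPCON is 1 for unitized, 0 for nonunitized.
rng = np.random.default_rng(2)
expcon = np.array([1.0] * 16 + [0.0] * 13)
n = expcon.size
t1 = 0.5 * expcon + rng.normal(size=n)              # groups already differ at T1
t5 = 0.6 * t1 + 1.0 * expcon + rng.normal(size=n)   # stability plus a treatment effect

r2_auto = r_squared(t5, [t1])                        # step 1: T1 autoregression only
r2_main = r_squared(t5, [t1, expcon])                # step 2: add EXPCON
r2_inter = r_squared(t5, [t1, expcon, t1 * expcon])  # step 3: add the interaction

f_expcon = f_for_increment(r2_auto, r2_main, n, k_full=2, k_added=1)
f_interaction = f_for_increment(r2_main, r2_inter, n, k_full=3, k_added=1)
```

Following the logic of the text, one would look at `f_interaction` first; only if the interaction is negligible is the EXPCON increment (`r2_main - r2_auto`) read as the adjusted unitized-nonunitized difference.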

Lagged Relationships Among Variables

We also sought to assess lagged relationships among a variety of
variables in the study. For this purpose we usually assumed that the
most relevant through-time influence on a dependent variable came from
variation in the immediately previous wave. Our approach examined
adjacent-wave contingencies among the selected variables. The regression
analysis attempted to find evidence that the Tn+1 variation in a dependent
variable was influenced by variation in other variables that occurred at
Tn.

The Tn-Tn+1 autocorrelation reflected the extent to which the level
of a variable at a particular wave came about in response to, or at least
as some predictable function of, its level at the immediately previous
wave. A low autocorrelation would suggest that the level of a variable
in a school at Tn+1 is a function of something other than its level at
Tn; a high autocorrelation would suggest the level of a variable came
about at least in partial response to its previous level and perhaps
in response also to some other variables.

There existed two situations for which we generally chose not to include
the autoregression term. One was the case in which we formally applied
a path analytic procedure to test a model of postulated lagged
relationships which specifically excluded the autoregression; including it in
such instances would have altered the model under examination. The other
was the case in which the independent variable(s) of central interest
at Tn correlated strongly with the autoregression variable. Under such
circumstances of simultaneous variation we would be unable to separate out
the effects of the key independent variable; the betas for each essentially
would be uninterpretable because they would be showing only effects of each
controlling for shared variation with the other, and the amount of shared
variation controlled would tend to be large.

Finally, the Companion Study* of the MITT Project carried out several
regressions to determine predictors of success in teaming in the 15
unitized schools. The predictor variables were formulated to characterize
schools, and hence limited the analysis to the school level; furthermore,
the focus of the Companion Study on the unitized schools limited the
sample size to 16 schools at best.**



*See Packard et al. (1976) for a more detailed description of the
Companion Study.

**Two of the original 16 unitized schools discontinued their unit
structure in the second year; another provided us no data concerning
instructional interdependence.



Our analytical approach assumed that whatever lagged effects
or autocorrelations we observed were characteristic of all schools at the
same time. Thus, we expected variables themselves to change at about the
same time in all schools and contingencies among variables to arise and
disappear at the same time in all schools. An alternative formulation,
however, is that the schools were out of synchrony in the changes that
occurred in each. Over a particular lag, a large change in a variable
in some schools may have been absent or in the opposite direction in
other schools; a contingency among two variables over a particular lag
in some schools may not have appeared until some later lag in other
schools. This alternative perspective was pursued as part of the
Companion Study (Packard et al., 1978, Ch. 8).

Combining Unitized and Nonunitized Schools

Partial Confounding Due to Unitized-Nonunitized Differences

The apparent impact of unitization in changing some of our
major variables like the number of pairs of Instructionally Interdependent
teachers (NPI) and the percent of Collegially-made decisions (COLL) posed
potential analysis problems. Both of these variables showed a change in
the experimental schools which lasted for the duration of the study
following the installation of the units. In the second year, Collegiality
increased slightly again and interdependence decreased slightly in the
unitized schools but both remained significantly above the levels found
in nonunitized schools.
When using all 29 schools together, however, a problem of ambiguity
in relationships existed among some main school-level variables of interest.
This can best be illustrated with the Collegiality (COLSUM) and
Interdependence (NPI) variables. The fact that unitized schools are higher
than nonunitized schools on both variables distorts the relationship
between the two when examined across all 29 schools.

Across all schools, COLL and NPI were positively correlated
within any wave; but these correlations are spurious, reflecting more
the similarity in level on the two variables within experimentals and
controls than any actual relationship between them. High correlations
of both COLL and NPI with the unitized-nonunitized classification
(EXPCON) confound the observed relationship between them; their
variances partly reflect the wide differences in mean levels between the
two types of schools.

This contamination with EXPCON differences also frustrates the
assessment of through-time stability of a variable because the
autocorrelations reflect more the stability of the set of unitized
schools being at a high level and the set of nonunitized schools being
at a low level on the variable. The stability coefficient reflects more
the enduring categorization of schools as unitized or nonunitized through
time.

These differences suggested the possibility that the wave-to-wave
relationships among the variables changed in the experimentals.
Conceivably changes could occur both in the stabilities of the variables
and in the cross-wave influence between any two different variables. The
usual approach for assessing this is to test the interaction between the
EXPCON variable and a covariate in their relationship to the dependent
variable. A significant interaction implies that the process under study
differs for experimentals and controls and, therefore, that the two sets
of schools remain separate for analysis purposes.

Another problem occurred which had implications for such an
interaction analysis. For variables like Collegiality and Interdependence,
the nonunitized schools had both lower mean values and more restricted
ranges than the unitized schools on each. In this type of circumstance
any differences between experimentals and controls in correlations between
variables or in stabilities (autocorrelations) for each variable may
actually be a function of the different ranges and means for the two types
of schools. Indeed, what might look like a strong interaction may
actually reflect some curvilinear relationship which goes undetected
because the range covered in one type of school essentially starts where
the other leaves off.

Figure 1-a depicts a possible case. The circles represent the
control schools; boxes represent the experimentals. If a covariance
analysis were run on these data it would show a significant interaction:
the relationship between X and Y would be about zero for controls
but positive for experimentals. However, had we a sufficient range in both
experimentals and controls on X, the relationship between X and Y may
actually prove to be curvilinear (Figure 1-b) or linear (Figure 1-c)
for both types of schools or, indeed, different in each (Figure 1-d). When
the observed data appear as in Figure 1-a, there is no way of
statistically sorting out the true relationship.



Figure 1: Relationships Between Variables in Experimentals
and Controls: a) actual data with restricted
ranges, b-d) possible relationships with fuller
range of variation on X in both experimentals
and controls.
Where the range problem occurred, an interaction analysis was
therefore deemed inappropriate since no confidence could be placed in any
significant interactions found. In order to have a substantial range
of variation and thereby lend somewhat more confidence to the analysis,
the schools were kept combined for a preliminary examination, thus
assuming no interaction existed.

The point of concern, however, is still that differences in variables
between unitized and nonunitized schools potentially distort the observed
relationships among the variables themselves. In such cases, Cohen and
Cohen (1975) would contend we cannot place much confidence in the observed
relationships among the two variables or in the stability coefficients of
either when the two sets of schools are combined. Any analysis should
attempt to remove this source of distortion before assessing the
relationships among the variables or their stabilities.

Corrections for Confounding

Decisions therefore had to be made on how to remove the distorting
influence of EXPCON and whether to remove it from both independent and
dependent variables. One approach would be to remove it only
from the independent variables. This course of action would require
only regression of the dependent variable at Tn onto both the autoregression
and the independent variable at Tn-1. The nature of the regression
procedure would give the unique influence of each on the dependent variable
controlling for the amount of variance in the dependent variable they
share together. Since this shared variance is primarily a function of
the mutual correlation of all variables with EXPCON, we would be assured
the analysis controlled for the most part for its distorting influence.

A more rigid form of control would be to include EXPCON as another
variable in the regression equation but this would increase the number
of variables, probably needlessly. This strategy, however, would have to
be used if we were interested in examining the relationship between, say,
collegial decision making or interdependence and some other variable
uncorrelated with EXPCON and still wished to control each for EXPCON.
This also would introduce the possibility that we would control for some
of the effect in which we were interested, however.

However, for a path analysis, these regression strategies appear
inappropriate. In a causal model what is in one place a dependent variable
can become an independent variable in another. Performing the
straightforward regressions for the analysis would, in effect, keep such a
variable contaminated/distorted with the experimental-control differences
in level when it is a dependent variable but residualize it as an independent
variable. A more appropriate strategy would be to first residualize all
variables by EXPCON for waves in which they show the strong
experimental-control differences.

For purposes of a cleaner interpretation of any single regression
equation, Cohen and Cohen (1975) contend this is the correct strategy.
Since the difference in levels for experimentals and controls confounds
relationships we seek, this difference should be removed from both
independent and dependent variables. This is accomplished most easily by
residualizing on the EXPCON variable at each wave. Cohen and Cohen call
this an Analysis of Partial Variance (APV), since a portion of variance
affected by a confounding variable is removed from all contaminated
variables and the relationships are assessed on the basis of the
remaining portion of variance in each.
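
The residualizing step can be sketched in a few lines. What follows is a
minimal illustration and not the project's actual computation: it assumes
EXPCON is coded 1 for unitized and 0 for nonunitized schools, and the
simulated data, the variable names, and the helper `residualize` are all
hypothetical.

```python
import numpy as np

def residualize(values, expcon):
    """Remove from `values` the part linearly predictable from `expcon`."""
    design = np.column_stack([np.ones(len(expcon)), expcon])
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    return values - design @ coef

# Hypothetical wave of data: 29 schools, 15 unitized (1) and 14 not (0).
rng = np.random.default_rng(0)
expcon = np.array([1.0] * 15 + [0.0] * 14)
within_x = rng.normal(size=29)
x = within_x + 2.0 * expcon                                # level difference in X
y = 0.5 * within_x + rng.normal(size=29) + 3.0 * expcon    # level difference in Y

# Residualize BOTH variables on EXPCON, then relate the residuals.
x_res = residualize(x, expcon)
y_res = residualize(y, expcon)

r_raw = np.corrcoef(x, y)[0, 1]
r_apv = np.corrcoef(x_res, y_res)[0, 1]
print(f"raw r = {r_raw:.2f}, r after residualizing on EXPCON = {r_apv:.2f}")
```

With a two-valued EXPCON, residualizing amounts to expressing each school's
score as a deviation from its own group's mean, so the difference in level
between groups can no longer inflate or mask the relationship.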
APPENDIX A:
Repeated Measures ANOVA and the MITT Data

Historically, repeated measures designs were developed to examine
the same unit of observation under several different treatments,
conditions, or trials. For example, several different ways of learning a
certain type of material or several trials of rehearsal on a list of words
comprised the repeated treatment for each subject. In these cases, the
researcher must randomize the order of presentation of the treatments and
of the words in the list. The design usually characterizes experiments
concerned with accounting for effects due to learning, transfer, and
fatigue. A main effect for the repeated dimension indicates that,
regardless of the order in which the subjects received the treatments or
conditions, their scores consistently increased or decreased. An increase
is typically interpreted in terms of rehearsal/practice/transfer of
learning concepts; a decrease is typically interpreted in terms of
fatigue/motivational concepts.

The pre-post design common to educational field research and evaluation
does not really fit this paradigm. For one thing, the levels of the
repeated dimension do not coincide with the administration of a treatment
or condition; the treatment, rather, intervenes between a pair of points
in time.

If MITT were to use the repeated measures ANOVA, any main effect found
for the repeated factor would pose interpretive problems due to this lack
of correspondence between the design models; although it would indicate
that scores changed (increased or decreased) over time, no useful concepts
like practice, transfer of learning, fatigue, or motivation exist to which
we could reasonably attribute that change.
We could not unambiguously ascribe the effect to unitization for
approximately half the schools, since the repeated dimension main effect
reflects a through-time averaging over both unitized and nonunitized
schools. If anything, we would expect to find such across-time effects
due to the unitized-nonunitized distinction showing up in a significant
interaction; however, the pre-post design for the research and the nature
of the treatment in a study like MITT guarantee a time-by-treatment
interaction.

Moreover, the use of the repeated measures model to analyze pre-post
or any time-to-time data violates assumptions crucial to a repeated
measures paradigm. One of these is that the correlations among the
levels of the repeated dimension (trials, treatments, conditions) are
the same. With random ordering of the trials or treatment conditions
for each subject, there need be no concern over the effects of one trial
on another. The covariation among trials is distributed throughout the
sample and, as one of the powerful characteristics of the analysis, is
accounted for by partitioning it out of the within-subject variation. In
the pre-post type of design, the order of the data collection waves cannot
be randomized among the schools; any covariation among them will therefore
be nonrandom.

The repeated measures ANOVA falls short also because one of its
distinctive characteristics lies in its great sensitivity to detecting
within-subject effects but relative insensitivity to detecting
between-subject or, in our case, unitized-nonunitized effects. The
covariation among trials represents explained variation due to variation
within each subject and the computational procedure extracts it, thereby
reducing the unexplained or residual portion of variance; consequently,
the analysis reduces the size of the residual term by an amount determined
by the degree of covariation among the trials. This residual term, as
the divisor of the F-statistic used to test within-subject effects, is
smaller than it would be had there been no repeated dimension and will
increase the chances of finding a significantly large F value.
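
The effect of extracting subject-related variation on the F ratio can be
shown with a small worked example. This is only a sketch under stated
assumptions: the 6-by-3 score matrix is invented, the function name
`trial_f_ratios` is hypothetical, and the comparison is between the usual
one-factor repeated measures error term and the error term that would
apply if the same scores were treated as coming from independent groups.

```python
import numpy as np

def trial_f_ratios(scores):
    """scores: (n_subjects, k_trials) array.

    Returns (F with subject variation partitioned out,
             F with subject variation left in the error term).
    """
    n, k = scores.shape
    grand = scores.mean()
    ss_total = ((scores - grand) ** 2).sum()
    ss_trials = n * ((scores.mean(axis=0) - grand) ** 2).sum()
    ss_subjects = k * ((scores.mean(axis=1) - grand) ** 2).sum()
    ss_error = ss_total - ss_trials - ss_subjects   # residual after both
    ms_trials = ss_trials / (k - 1)
    f_repeated = ms_trials / (ss_error / ((n - 1) * (k - 1)))
    f_independent = ms_trials / ((ss_subjects + ss_error) / (k * (n - 1)))
    return f_repeated, f_independent

# Hypothetical data: large stable differences among subjects, small trend.
subject_level = np.array([10.0, 14.0, 18.0, 22.0, 26.0, 30.0])
trial_shift = np.array([0.0, 1.0, 2.0])
noise = np.array([[0.2, -0.1, 0.0], [-0.2, 0.1, 0.1], [0.0, 0.2, -0.2],
                  [0.1, -0.2, 0.1], [-0.1, 0.0, 0.2], [0.0, 0.0, -0.2]])
scores = subject_level[:, None] + trial_shift[None, :] + noise

f_repeated, f_independent = trial_f_ratios(scores)
print(f"F (repeated) = {f_repeated:.1f}, F (independent) = {f_independent:.2f}")
```

Because the numerator is the same in both ratios, the difference comes
entirely from the divisor: the stronger the covariation among trials, the
more variation is removed from the residual term and the larger the
within-subject F becomes.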
APPENDIX B:

Approaches to Analysis of Change

We spent some time working through the Cronbach and Furby (1970)
article on the measurement of change to see what implications it had for
MITT analyses. This appendix reflects our thinking on their strategy
in addition to comments on related approaches found in other articles.

Our assessment was that we should proceed in our longitudinal analyses
without worrying about using their strategy for estimating true scores.
Given the characteristics of the MITT design and variables, we saw
no guarantee that the Cronbach-Furby method or any of the other methods
mentioned in this Appendix would provide us with more accurate and
unambiguous estimates than we were presently obtaining with multiple
linear regression.


The Case Against Change Scores 

According to Cronbach and Furby, change scores, residuals, and base
free measures should not be used in statistical analyses. They will give
either the same results as an analysis of the original data or results
more difficult to interpret.

The relationship between change and initial status can be more
simply expressed in terms of the relationship between initial and final
status ($B_{GX} = B_{YX} - 1$, since $G = Y - X$);* the relationship
between change and another variable, W, is likewise more simply expressed
as the relationship between final status and that variable controlling for
initial status ($B_{YW \cdot X}$).

* Here, G = gain or change, X = initial status, Y = final status.

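
The redundancy of the gain score can be checked numerically. The sketch
below assumes raw-score ordinary least squares and a gain defined as
G = Y - X; the simulated data and the helper `ols` are hypothetical. Under
those assumptions the slope of gain on initial status is the slope of final
status on initial status minus one, and the coefficient for W is identical
whether gain or final status is the dependent variable, provided initial
status is controlled.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)                        # initial status X
w = 0.4 * x + rng.normal(size=n)              # another variable W
y = 0.7 * x + 0.3 * w + rng.normal(size=n)    # final status Y
g = y - x                                     # gain G

def ols(dep, *predictors):
    """Least-squares coefficients, intercept first."""
    design = np.column_stack([np.ones(len(dep)), *predictors])
    coef, *_ = np.linalg.lstsq(design, dep, rcond=None)
    return coef

b_yx = ols(y, x)[1]          # final status on initial status
b_gx = ols(g, x)[1]          # gain on initial status
b_yw_x = ols(y, x, w)[2]     # final status on W, controlling for X
b_gw_x = ols(g, x, w)[2]     # gain on W, controlling for X

print(f"b_gx = {b_gx:.3f}, b_yx - 1 = {b_yx - 1:.3f}")
print(f"b_gw.x = {b_gw_x:.3f}, b_yw.x = {b_yw_x:.3f}")
```

Both equalities hold exactly in any sample, which is the sense in which
the gain-score analysis adds nothing to an analysis of final status with
initial status in the equation.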
In studying gains as consequences of treatments, one is interested
in the null hypothesis that experimentals show the same effect as controls.
The question, then, is whether true final status (Yt) scores vary from
group to group. Experimentals and controls can be expected to show some
measure of change over time in relation to their initial status; although
the final status may not be a direct consequence of initial status,
it is in part predictable from it regardless of the particular "treatment"
being imposed.

The analysis of covariance takes this predictable variation into
account and then compares the deviations of observed scores from the
prediction between the groups. This is the strategy we have taken with the
MITT data. If experimentals and controls differ markedly in the deviations,
then the difference is attributed to the experimental-control distinction,
although this in itself does not explain what it is about the distinction
that actually produces those differences.
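
A regression form of this covariance analysis can be sketched as follows.
It is an illustration only: the data are simulated, the 0/1 group coding
and all variable names are assumptions, and a common slope for the two
groups is taken for granted.

```python
import numpy as np

rng = np.random.default_rng(3)
group = np.array([1.0] * 15 + [0.0] * 14)     # 1 = experimental, 0 = control
initial = rng.normal(50.0, 10.0, size=29)
final = 10.0 + 0.8 * initial + 4.0 * group + rng.normal(0.0, 3.0, size=29)

# Final status regressed on initial status and the group dummy.
design = np.column_stack([np.ones(29), initial, group])
coef = np.linalg.lstsq(design, final, rcond=None)[0]
intercept, slope, group_effect = coef

# Deviations of observed final scores from the prediction based on
# initial status alone (using the common slope), compared between groups.
deviations = final - (intercept + slope * initial)
deviation_gap = deviations[group == 1].mean() - deviations[group == 0].mean()

print(f"adjusted group difference = {group_effect:.2f}")
print(f"gap in mean deviations    = {deviation_gap:.2f}")
```

The coefficient on the group dummy and the gap between the groups' mean
deviations are the same number, which is why the regression can stand in
for the verbal description of the covariance analysis.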

Nature of the Problem 

The basic thrust of the Cronbach-Furby article is the same as that in
John Meyer's recommendation* to use Michael Hannan's (19? ) longitudinal
analysis approach and as that spelled out by Wiley and Hornik (1973): to
use all pertinent information to get a handle on measurement error and
thereby derive more accurate estimates of true scores. Their proposals
essentially describe measurement models for relating true variables to
their measured values and rely upon assumptions of classical test theory.
The analysis problem is that measurement errors may have large distorting
influences in the assessment of relationships among variables if they are
not explicitly taken into account.

* Personal communication

Each observed score is considered to be a combination of true
score plus measurement error. Over all measurements there occurs
a distribution of observed scores and of errors. A true score for an
individual/school is thought of as the average score over a large number
of repeated measurements of a variable at a particular time point.



Strategies for Estimating True Scores

Correction for Attenuation 
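This is the simplest and most straightforward strategy. There are
two approaches.

One involves calculating the correlation between two variables
that would occur if they were both perfectly reliable. It entails
using the reliability coefficients in the following formula:

$$ r'_{12} = \frac{r_{12}}{\sqrt{r_{11}\, r_{22}}} $$

where $r'_{12}$ = corrected r, $r_{12}$ is the obtained correlation, and
$r_{11}$ and $r_{22}$ are the reliability coefficients of the two variables.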
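
As a worked illustration, the standard correction for attenuation divides
the obtained correlation by the square root of the product of the two
reliabilities. The function name and the numbers below are hypothetical.

```python
import math

def correct_for_attenuation(r12, r11, r22):
    """Correlation expected if both variables were perfectly reliable.

    r12 is the obtained correlation; r11 and r22 are the reliability
    coefficients of the two variables.
    """
    if not (0.0 < r11 <= 1.0 and 0.0 < r22 <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r12 / math.sqrt(r11 * r22)

# Obtained r of .40 with reliabilities of .80 for both variables.
print(correct_for_attenuation(0.40, 0.80, 0.80))

# Low (or underestimated) reliabilities can push the result past 1.0.
print(correct_for_attenuation(0.60, 0.50, 0.60))
```

The second call shows why the quality of the reliability estimates
matters: a corrected value above 1.0 is not a possible correlation and
signals that at least one reliability is too low.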
Any planned regression analyses would then employ these corrected
correlations. Some controversy exists, however, over the use of such
corrected correlations.

(1) One may be fooling oneself into believing that a better
correlation has been uncovered than the one actually obtained. Nunnally
(1967) notes that correlations corrected for attenuation seldom
differ much in magnitude from the obtained correlations.

(2) The correction itself may be poor if the reliability estimates
themselves are poor. Moreover, since the possibility exists for reversals
in signs for regression coefficients and partial and part correlations if
one uses corrected rather than actual correlations, one must have
confidence in the reliabilities in order to have confidence in the
regression output from corrected correlations.

(3) Nunnally adds that in prediction-type problems it may be
inappropriate to correct for unreliability in the criterion, since the
issue is to predict or explain scores on that variable as they actually
exist, not as they would exist were the test perfectly reliable. He seems
to be particularly addressing prediction problems related to selection
decisions.

The second approach involves obtaining estimates of "unbiased"
scores because obtained scores tend to be biased, i.e., high scores
tend to be higher than their true score counterparts and low scores
tend to be lower. Conceptually, unbiased scores are those that people
(schools/units) would obtain if they were administered all possible
tests having equal numbers of items sampled randomly from the same
domain; they are estimates of true scores. In the formula below,

$$ t' = r_{xx}\, x $$

$x = (X - \bar{X})$, $r_{xx}$ is the reliability coefficient, and $t'$ is
a true score estimate in deviation score units. By adding $t'$ to
$\bar{X}$ one obtains an estimated true score.

Nunnally maintains that estimating true scores is necessary
only in longitudinal analyses where one is interested in contrasting
the changes between groups. To correct obtained scores, O'Connor
(1972) advocates using the test-retest reliability; if it were
unavailable, then some measure of internal consistency would be
satisfactory.
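
The second approach can be illustrated with a short sketch. It assumes
the estimate takes the usual regressed form, in which each deviation from
the mean is multiplied by the reliability coefficient and the mean is then
added back; the function name and the scores are hypothetical.

```python
import numpy as np

def estimated_true_scores(obtained, reliability):
    """Pull each obtained score toward the mean in proportion to unreliability."""
    obtained = np.asarray(obtained, dtype=float)
    mean = obtained.mean()
    deviation = obtained - mean            # x = X - mean
    return mean + reliability * deviation  # mean + t'

scores = [40.0, 50.0, 60.0]
print(estimated_true_scores(scores, reliability=0.8))
```

High obtained scores are lowered and low obtained scores are raised, which
is the direction of bias the text describes; with a reliability of 1.0 the
obtained scores are returned unchanged.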

Cronbach and Furby Method 

These authors extend the idea of correcting raw scores. Their
strategy is to estimate true scores for independent and dependent
variables within experimental and control groups and then to enter
these true scores into regression equations. Their calculation for
a true score on any variable employs more information than the
instrument's reliability coefficient alone.

Let X1 and X2 represent time 1 (T1) and time 2 (T2) scores on
variable X. Unless the true correlation between the two is zero, both
X1 and X2 contain information about the true score for X1, here indicated
as X1t, and both can be used in a regression equation to obtain predicted
true X1t scores.

Information about X1t from the actual X1 scores is reflected
in the reliability coefficient. From X2 scores it is found in the
deviations of X2 values from values predicted by the regression of
X2 on X1 within the experimental and control groups separately.
They present the following general formula (p. 71), expressed
here in terms of X1 and X2:

$$ \hat{X}_{1t} = r_{X_1 X_1} X_1 + B_{X_{1t}(X_1 \cdot X_2)} (X_1 \cdot X_2) + (1 - r_{X_1 X_1}) \bar{X}_1 $$

where $X_1 \cdot X_2$ is their notation to represent residual scores formed
from predicting X2 from X1, and the final expression is an adjustment of
the mean of the group in terms of the lack of correlation between true and
obtained scores, i.e., unreliability. Apparently, the reliability
coefficient used in the equation would also be calculated within each
group, although the authors do not specifically say so.

One can pool the groups to obtain single "within-group" values
for the parameters in the above equation, but the estimates will be
better when calculated separately within groups. This is especially
true for groups not formed randomly because the true score distributions
within each tend not to be the same; this implies that the same
observed score, say X1, has a different true score, X1t, depending
upon the group.
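
The structure of such an estimate, computed within a single group, might
be sketched as follows. This is an assumption-laden illustration rather
than the Cronbach-Furby procedure itself: the reliability and the weight
on the residual term (`resid_weight`) are supplied by hand, whereas the
authors derive that weight from estimated true-score variances and
covariances, and the scores and the function name are hypothetical.

```python
import numpy as np

def true_score_estimate(x1, x2, reliability, resid_weight):
    """Estimate X1t for one group from X1 and the residual of X2 on X1."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    slope, intercept = np.polyfit(x1, x2, 1)    # regression of X2 on X1
    resid = x2 - (intercept + slope * x1)       # part of X2 not predicted by X1
    return (reliability * x1
            + resid_weight * resid
            + (1.0 - reliability) * x1.mean())  # adjustment toward group mean

# Hypothetical scores for the schools in one group at two waves.
x1_group = np.array([3.1, 2.4, 3.8, 2.9, 3.5, 2.2])
x2_group = np.array([3.6, 2.5, 4.1, 3.4, 3.3, 2.8])
estimate = true_score_estimate(x1_group, x2_group,
                               reliability=0.7, resid_weight=0.2)
print(np.round(estimate, 2))
```

Calling the function once per group, with that group's own reliability
and weight, corresponds to estimating separately within groups; passing
the pooled sample corresponds to the single "within-group" alternative.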
Cronbach and Furby go on to say that other relevant T1 variables
should also be included in the true score estimate, the major limitation
being the sample size used in the regressions. Wiley and Harnischfeger
(1973) qualify that recommendation, advocating that such other variables
should be used only if one can defend, through a causal model, that they
are theoretically direct determinants of the X2 variable. Their impression
is that Cronbach and Furby would throw a whole pile of T1 background and
other variables into the analysis indiscriminately in the hopes of
reducing error.

Cronbach and Furby provide a method of calculating true score variances
and then entering these into regressions rather than using raw scores.
They also distinguish between linked and unlinked T1-T2 measures and
furnish the adjustments that need to be made for each.

Although my major concern was initially with the Cronbach-Furby
approach, I want to discuss two others with the same underlying focus.

Wiley and Hornik Method 
Wiley and Hornik (1973), to get accurate true score estimates,
developed a measurement model to deal with errors in panel data, but it
relies on having more than one measure of each variable at each point
in time. The two measures for any single variable will reflect the same
true score variance but different error variances. Calling classical test
theory into play, they make certain assumptions about the independence
and additivity of variance components. Their use of cross-time and
alternate observed measures allows for the estimation of unobserved
true and error variance in the variables. Once the true variances and
covariances of the variables have been calculated, they can be used to
compute regression weights for the relationships among variables over
time.

Hannan and Others

Wiley and Hornik refer to more "optimal" methods for using multi-wave
longitudinal data. They are more optimal in the sense that they
reduce the standard error around the estimates because the calculations
involved employ more of the pieces of variance information that are
available.

The references they cite, particularly for the relevant computer
program, are the same as those John Meyer gave me as he talked about
confirmatory factor analysis. Meyer's suggestion was to use it only if
we found statistical significance with a technique developed by Michael
Hannan and Alice Young (n.d.). Their method was an attempt to pool
variance information about the variables of interest across all waves
to get a more accurate handle on statistical significance;
like others, their intent was to reduce the amount of error variance
to get an accurate estimate of true score relationships. The Hannan-Young
method and confirmatory factor analysis require complicated estimation
procedures which cannot be performed with least squares regression
analyses.
Problems with the Strategies

MITT's concern with the Hannan-Young and confirmatory factor
analysis techniques lay partly in their questionable appropriateness
to the MITT design. Hannan and Young advocate a particular model
whose use is constrained by several assumptions about the user's data
(most of which we found difficult to grasp) and which itself has not been
well tested. Their report is the only one that appears to lend support
to the model's utility, but under what circumstances I am not sure.

The Wiley-Hornik method is simpler to use but does not seem to fit
our design either. Furthermore, for a study of MITT's magnitude the
required calculations appear quite laborious.

Of all the methods, the Cronbach-Furby and the Correction for
Attenuation seem the most straightforward at first glance. Yet certain
characteristics of the MITT study leave these open to question also.

Before discussing them, it may first be worth noting that the
variability of the sample under study affects the reliability, one reason
for requiring large random samples in reliability studies. One way of
expressing the coefficient is as follows:

$$ r_{xx} = 1 - \frac{\sigma^2_{meas}}{\sigma^2_X} $$

The variance of the errors of measurement ($\sigma^2_{meas}$) is
considered to be approximately independent of the variance of the obtained
scores ($\sigma^2_X$) and is therefore conceptually regarded as a fixed
characteristic of the instrument regardless of the sample being studied.
As the variance of the sample increases, then, the ratio
$\sigma^2_{meas} / \sigma^2_X$ will decrease and the reliability will
increase; as the variance decreases, the ratio will increase and the
reliability decrease.

Nunnally (1967) notes that a low reliability for an instrument
will make detection of statistical significance difficult; when the
standard error of measurement and the standard deviation of the variable
in the sample are approximately equal, he claims it is hopeless to
investigate the variable.
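
The dependence of reliability on sample variability can be made concrete
with a few numbers. The sketch assumes reliability is expressed as one
minus the ratio of error variance to obtained-score variance, with the
error variance held fixed as a property of the instrument; the values are
invented for illustration.

```python
def reliability(error_variance, obtained_variance):
    """Reliability as 1 minus the ratio of error to obtained-score variance."""
    return 1.0 - error_variance / obtained_variance

error_variance = 4.0   # squared standard error of measurement, held fixed
for obtained_variance in (16.0, 8.0, 4.0):
    r = reliability(error_variance, obtained_variance)
    print(f"obtained variance {obtained_variance:5.1f} -> reliability {r:.2f}")
```

The same instrument looks less reliable in a more homogeneous sample, and
when the obtained-score variance shrinks to the error variance (standard
error of measurement equal to the sample standard deviation) the
reliability falls to zero, which is the hopeless case noted above.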

My concerns center around four major points:

(1) I question the suitability of the formulas for MITT data;
at least, I am not sure the models were developed with our type of
instrumentation and unit of analysis problems in mind. The
Cronbach-Furby and the Attenuation Correction methods have their focus on
research using tests of abilities, I.Q., and personality traits which
themselves have a tradition of being extensively researched for
reliability and validity on large random samples of subjects.
Consequently, the reliabilities of such measures are generally
acknowledged as being stable and accurate and not affected by the
variances of research samples in which they are used.
MITT deals with different phenomena; many of the types of variables
involve opinions about others, perceptions about the school, some
perceptions of self, and descriptions of behavior. None of the instruments
received the extensive attention that trait and ability measures have
traditionally received, nor were their reliabilities calculated on
exceedingly large random samples. This is not to say they are no good; it
just questions the amount of confidence we can place in the accuracy of
the reliabilities that would be used to estimate true scores. Questionable
reliabilities would guarantee us nothing much better than questionable
true score estimates.

(2) The accuracy of the reliabilities relates to a more difficult
problem: How is a reliability to be computed?

     a) If it is to be computed for each group (experimental
     vs. control), then it will change in accord with differences
     in the variance of each group even though a reliability, like
     the standard error of measurement, is conceptually a characteristic
     of an instrument independent of any sample of subjects.

     b) Since we are involved in a school level analysis, can we
     justifiably use reliabilities based on individuals when our unit
     of analysis is the school? This is part of an aggregation problem
     discussed by Hannan (1971), who maintains that aggregated scores
     measure a theoretically different variable than the unaggregated
     scores; perhaps reliabilities should be based on school scores
     rather than individual scores.

(3) The Cronbach-Furby method includes a term consisting of
residuals, deviations of X2 from scores predicted from X1, as calculated
within each group. The logic of using posttest to predict pretest is
that,

     Within a treatment, persons higher on the posttest
     than others having the same observed pretest score
     tend to be those for whom the true pretest score is
     higher than the observed score (p. 72).

This may be applicable for ability and trait measures, but I'm
not sure how accurately it models a lot of relationships with the
MITT measures. For example, we may find that an experimental school
has more classroom communication than another experimental school but
that both have the same amount of communication at the previous wave.
The contention of Cronbach and Furby is that the first school should
theoretically have a higher communication level at the previous wave
also.

(4) Finally, whether one employs variances or raw score regressions
to compute true scores according to the Cronbach-Furby approach, one still
faces the prospect of error in the variances due to small sample size.
This is especially the case if true scores for variables must be
estimated within each group.

In summary, we found no guarantee that any of the methods above
would provide us accurate estimates of true scores. The arguments
for use of the methods make sense, but we were not sure how accurately
the models for handling error reflect the characteristics of our
design and variables. We do think a study to employ the models on
the MITT data is something that could be written as a proposal itself.
My recommendation at this point is to proceed as we have been.

APPENDIX C:
Rudiments of Path Analysis



Although path analysis cannot prove causality, it can lend more or
less confidence to a postulated model that describes the relationships one
expects to find among the variables. The crucial, and probably most
beneficial, aspect of the method is that it requires a clear specification
of the causal model: the more ambiguous the variables and their
relationships, the less confidence one can place in the analysis (see
Appendix C).

The basic inferential tool crucial to path analysis is multiple
linear regression, which allows one to examine the magnitudes and
directions of "direct" effects and their statistical significance while
controlling for mutual influences among independent variables. The
hypothesized causal model itself can be represented by a set of multiple
linear regression equations.

Because variables in behavioral science research are often expressed
in arbitrary scales, not much substantive information about a path analytic
model is conveyed by nonstandardized regression weights, which specify
that a 1.0-point change in the independent variable causes b points of
change in the dependent variable. This is because the different scale
ranges of the independent variables obscure the importance of different
variables relative to one another when the nonstandardized b-weights are
used.
For this reason the standardized regression weights, β's or betas,
are used as path coefficients to represent the direct effect of independent
on dependent variables. Each coefficient estimates the amount of change
in standard deviation units of the dependent variable that is produced
by a 1-standard deviation change in the respective independent variable
(Amick and Walberg, 1975).
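
The difference between the two kinds of weights can be seen in a short
sketch. The data are simulated and the function name
`standardized_betas` is hypothetical; the example rescales one predictor
by a factor of 100 to mimic an arbitrary change of measurement scale.

```python
import numpy as np

def standardized_betas(y, X):
    """Regression weights after converting every variable to z-scores."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    betas, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    return betas

rng = np.random.default_rng(4)
X = rng.normal(size=(29, 2))
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(size=29)
X_rescaled = X * np.array([100.0, 1.0])   # first predictor on a new scale

betas = standardized_betas(y, X)
betas_rescaled = standardized_betas(y, X_rescaled)

# Nonstandardized b-weights for the rescaled predictors (intercept dropped).
design = np.column_stack([np.ones(29), X_rescaled])
raw_b = np.linalg.lstsq(design, y, rcond=None)[0][1:]

print("betas:         ", np.round(betas, 3))
print("betas rescaled:", np.round(betas_rescaled, 3))
print("raw b-weights: ", np.round(raw_b, 4))
```

The b-weight for the rescaled predictor shrinks by the scaling factor
while its beta is unchanged, and each beta equals the corresponding
b-weight multiplied by the ratio of the predictor's standard deviation to
that of the dependent variable.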

The use of multiple linear regression in path analysis focuses
upon explanation rather than classical prediction. The investigator
desires not only to explain a substantial proportion of variance in
the dependent variables but also to assess the relative importance of
theoretically relevant independent variables.

The first step in formally analyzing data by path analysis is to
explicitly specify a presumed unidirectional (recursive) causal ordering
among the set of variables of interest. This model purports that the
correlation between any two variables, except for those not causally
determined by any other variables in the model, can be decomposed into
a term representing the direct effect of one on the other plus a series
of other terms representing the indirect effects. The indirect effects
reflect portions of the correlation explained by spurious and/or
mediated relationships.

The following diagram represents an example of a causal model
with variables ordered from 1 to 4.

Variables #1 and #2 have no hypothesized causal determinants among the
selected variables and, therefore, the numbers signify only that they
are different and not that one causally precedes the other. The
two-headed arrow between them indicates that we cannot analyze their
correlation. All other variables do have some hypothesized causes.

The diagram depicts what is called a recursive model because the
causal flow is in one direction. Single-headed arrows between variables
represent direct effects. Notice that some variables may each act as
an independent variable and also as a dependent variable with respect
to a subset of other variables in the model. Multiple-step paths
showing variables acting through other variables to influence a
dependent variable represent indirect effects. For example, the
correlation between variables #2 and #4 is accounted for by a direct
effect of #2 on #4 and a mediated effect of #2 acting through #3, which
in turn affects #4. Because it is impossible to explain the total
variation in any dependent variable completely by the designated
independent variables, the residual variable is needed as a catch-all
to account for all variance unexplained by the variables under scrutiny
(Kerlinger and Pedhazur, 1973; Namboodiri et al., 1975).
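
The decomposition of a correlation into direct and indirect parts can be
verified numerically. The sketch below assumes a fully recursive
four-variable model in which #3 depends on #1 and #2, and #4 depends on
#1, #2, and #3; that particular structure, the simulated data, and the
helper names are assumptions made for illustration and may not match the
diagram exactly.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
v1 = rng.normal(size=n)
v2 = 0.3 * v1 + rng.normal(size=n)          # #1 and #2 merely correlated
v3 = 0.4 * v1 + 0.5 * v2 + rng.normal(size=n)
v4 = 0.2 * v1 + 0.3 * v2 + 0.6 * v3 + rng.normal(size=n)

def z(v):
    """Convert a variable to z-scores."""
    return (v - v.mean()) / v.std()

def path_coefs(dep, *causes):
    """Standardized regression weights of `dep` on its hypothesized causes."""
    Z = np.column_stack([z(c) for c in causes])
    coef, *_ = np.linalg.lstsq(Z, z(dep), rcond=None)
    return coef

p31, p32 = path_coefs(v3, v1, v2)
p41, p42, p43 = path_coefs(v4, v1, v2, v3)
r = np.corrcoef([v1, v2, v3, v4])

direct = p42                                   # #2 -> #4
mediated = p32 * p43                           # #2 -> #3 -> #4
via_variable_1 = r[0, 1] * (p41 + p31 * p43)   # shared with #1, unanalyzed

print(f"r(2,4) = {r[1, 3]:.3f}")
print(f"direct {direct:.3f} + mediated {mediated:.3f} "
      f"+ via #1 {via_variable_1:.3f} = "
      f"{direct + mediated + via_variable_1:.3f}")
```

The three components sum to the observed correlation between #2 and #4.
The last component arises only because #2 is correlated with #1, and since
that correlation is left unanalyzed, the component cannot be assigned a
causal interpretation.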
APPENDIX D:
Autocorrelations and Changes in Means

Usually, a longitudinal analysis will examine means at each wave
to determine whether or not change occurs. In conjunction with them, the
autocorrelations can provide some useful information about the
variation in variables and prevent hasty generalizations of the
through-time trend in group means to each school comprising the group. If
means increase or decrease, one often tends to infer that the level of the
variable in each school does so also; if the mean level of the
variable remains the same, one may similarly infer that nothing
has changed in the schools. But the change in means is an index of
group tendency, not individual school variation. The group mean may show
no change from one time to the next even though individual schools change
drastically. The group mean may show a drastic jump or decrease even
though some schools either do not change or change in the opposite
direction.

The autocorrelation of a variable between two time points, however,
can provide information that would help confirm or caution against such a
generalization. If the mean level changed but the autocorrelation were
low, one would hesitate to generalize the trend to the majority of
schools in the sample. If the mean level changed and the autocorrelation
were high, one would have confidence in generalizing. If the mean level
remained the same and the autocorrelation were high, one would confidently
generalize that the schools tended not to change.
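
A small invented example shows how the same change in the group mean can
accompany very different autocorrelations. The scores for six hypothetical
schools and the helper `autocorr` are assumptions made for illustration.

```python
import numpy as np

def autocorr(earlier, later):
    """Correlation of a variable with itself across two waves."""
    return np.corrcoef(earlier, later)[0, 1]

wave1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Case A: every school rises by one point.
wave2_uniform = wave1 + 1.0

# Case B: the same rise in the mean, but the schools' rank order reverses.
wave2_reversed = np.array([7.0, 6.0, 5.0, 4.0, 3.0, 2.0])

for label, wave2 in (("uniform", wave2_uniform), ("reversed", wave2_reversed)):
    mean_change = wave2.mean() - wave1.mean()
    print(f"{label:8s} mean change = {mean_change:.1f}, "
          f"autocorrelation = {autocorr(wave1, wave2):.2f}")
```

In both cases the group mean rises by one point, yet only in the first
case does the trend describe what happened in each school; in the second,
half the schools declined, and the autocorrelation is what reveals it.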